Spectral Sequence Motif Discovery

Authors

  • Nicolò Colombo
  • Nikos Vlassis
Abstract

Sequence discovery tools play a central role in several fields of computational biology. In the framework of Transcription Factor binding studies, motif finding algorithms of increasingly high performance are required to process the big datasets produced by new high-throughput sequencing technologies. Most existing algorithms are computationally demanding and often cannot support the large size of new experimental data. We present a new motif discovery algorithm that is built on a recent machine learning technique, referred to as Method of Moments. Based on spectral decompositions, this method is robust under model misspecification and not prone to locally optimal solutions. We obtain an algorithm that is extremely fast and designed for the analysis of big sequencing data. In a few minutes, we can analyse datasets of more than a hundred thousand sequences and produce motif profiles that match those computed by various state-of-the-art algorithms.

Related resources

Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences

This work presents a hybrid method for motif discovery in DNA sequences. The proposed method, called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...

Visual Motif Discovery via First-Person Vision

Visual motifs are images of visual experiences that are significant and shared across many people, such as an image of an informative sign viewed by many people and that of a familiar social situation such as when interacting with a clerk at a store. The goal of this study is to discover visual motifs from a collection of first-person videos recorded by a wearable camera. To achieve this goal, ...

Efficient Algorithms for Model-Based Motif Discovery from Multiple Sequences

We study a natural probabilistic model for motif discovery that has been used to experimentally test the quality of motif discovery programs. In this model, there are k background sequences, and each character in a background sequence is a random character from an alphabet Σ. A motif G = g1g2 . . . gm is a string of m characters. Each background sequence is implanted a randomly generated approx...

DREME: motif discovery in transcription factor ChIP-seq data

MOTIVATION: Transcription factor (TF) ChIP-seq datasets have particular characteristics that provide unique challenges and opportunities for motif discovery. Most existing motif discovery algorithms do not scale well to such large datasets, or fail to report many motifs associated with cofactors of the ChIP-ed TF. RESULTS: We present DREME, a motif discovery algorithm specifically designed to f...

Genetic Algorithm Based Probabilistic Motif Discovery in Unaligned Biological Sequences

Finding motifs in biosequences is the most important primitive operation in computational biology. A motif discovery algorithm faces many computational requirements, such as memory space and computational complexity. To overcome the complexity of motif discovery, we propose an alternative solution integrating genetic algorithm and Fuzzy Art machine learning approaches...

Journal:
  • CoRR

Volume abs/1407.6125

Publication date 2014